The internet of things (IoT) has revolutionized connectivity and introduced 
significant security challenges. In this context, intrusion detection systems 
(IDS) play a crucial role in detecting attacks in IoT environments. Bot-IoT 
datasets often face class imbalance issues, with the attack class having 
significantly more samples than the normal class. Addressing this imbalance 
is essential to enhance IDS performance. This study evaluates an imbalance 
ratio technique, which we call the imbalance ratio formula (IRF), for 
controlling imbalanced data, and compares it with oversampling 
techniques such as the synthetic minority oversampling technique 
(SMOTE) and adaptive synthetic sampling (ADASYN). The research also 
incorporates the extreme gradient boosting (XGBoost) ensemble model to 
improve IDS performance in dealing with multiclass imbalance in Bot-IoT 
datasets. Through in-depth analysis, we identify the strengths and 
weaknesses of each method. 
This study aims to guide researchers and practitioners working on IDS in 
high-risk IoT environments. The proposed IRF, when integrated with the 
XGBoost algorithm, achieves a comparable accuracy of 99.9993% while 
training, on average, at least twice as fast as the other state-of-the-art 
ensemble methods. 

1. INTRODUCTION 


The internet of things (IoT) has brought about a profound transformation in various sectors, 
providing limitless connectivity between devices and data [1]. However, the benefits of the IoT revolution 
also come with significant security challenges [2]. Intrusion detection systems (IDS) play a crucial role in 
preserving the integrity and security of the IoT environment by detecting potential attacks [3]. The pervasive 
influence of information technology has magnified the role of networks in our daily lives, affecting various 
facets of society. Consequently, network security has become an intricate challenge as technology continues 
to proliferate [4]. The security landscape is not limited to fundamental components like vertical encryption; it 
also encompasses the monitoring of potential intruders within network traffic [5]. This expansion, driven by 
the widespread adoption of technology, bestows substantial benefits upon end-users, including time and 
effort savings. In this context, the IDS emerges as a critical element capable of identifying and flagging 
suspicious activities within a network or system. When the IDS detects questionable behaviors associated 
with network traffic, it promptly dispatches alerts to network or system administrators [6]. Notably, the 
Bot-IoT dataset, the latest addition to the intrusion detection datasets designed for IoT environments, was 
created at the University of New South Wales (UNSW) Canberra Cyber Range Lab to provide 
realistic testing environments for IoT scenarios. This dataset comprises five traffic categories, namely 
distributed denial of service (DDoS), denial of service (DoS), reconnaissance, theft, and normal traffic, 
reflecting the evolving landscape of network security challenges [7]. 

IDS is used to detect hacking activities in computer systems and suspicious incoming and 
outgoing network traffic [8]. Researchers have relied on several benchmark datasets for IDS. The knowledge 
discovery in databases dataset (KDD-CUP'99) [9] was built from data captured in the 
DARPA'98 IDS evaluation program. The network security laboratory-knowledge discovery and data mining 
(NSL-KDD) dataset [10] is a collaboration of several researchers in the KDD project that tries to 
address the shortcomings of the KDD-CUP'99 dataset. The UNSW-NB15 dataset [11] was developed 
by a team of researchers at the Australian Centre for Cyber Security (ACCS) within UNSW. The Bot-IoT 
dataset used in this study is a more recent dataset that succeeds UNSW-NB15 [12]. 

Addressing the complexity of multiclass imbalanced data, particularly in the Bot-IoT dataset, 
presents a critical challenge in IDS development. Various approaches have been explored, with the imbalance 
ratio technique gaining attention for its focus on balancing majority and minority classes through adjustments 
based on a calculated ratio. The suggested utilization of the imbalance ratio formula (IRF) aims to impart 
greater weight to classes with fewer instances during model training, specifically addressing the challenges in 
the Bot-IoT dataset [13]. Complementing this technique, the oversampling methods synthetic minority 
oversampling technique (SMOTE) and adaptive synthetic sampling (ADASYN) have demonstrated efficacy 
in improving minority class representation. This research contributes to managing multiclass 
imbalance by evaluating the imbalance ratio technique alongside these oversampling methods, providing 
insights for IDS development. The main contributions of this study are as follows: 

a. A comprehensive evaluation of the IRF technique, oversampling methods (SMOTE and ADASYN), and 
the extreme gradient boosting (XGBoost) ensemble model in a multiclass scenario to enhance IDS 
performance. 

b. A new class IRF that controls the weight of each class, placing more emphasis on classes with fewer 
instances during model training. 

c. A collaborative model for intrusion detection that integrates the proposed class IRF with the XGBoost 
ensemble model for IoT environments with imbalanced data. 

The remainder of this paper is organized as follows. Section 2 reviews related work on IDS and the 
imbalance ratio, with a discussion of techniques for handling class imbalance. Section 3 outlines the method. 
Section 4 presents the experimental results and discussion, including the evaluation metrics used. Finally, 
Section 5 summarizes the study's findings, outlines limitations and future research directions, and highlights 
the practical implications of the proposed method. 


2. RELATED WORK 

IDS play a crucial role in identifying and preventing unauthorized access to computer networks. 
Class imbalance, characterized by a disproportionate number of normal instances compared to intrusion 
instances, poses a persistent challenge in the effectiveness of these systems. Researchers have actively 
investigated and implemented diverse strategies, with a notable emphasis on leveraging imbalance ratio 
techniques to enhance the accuracy and reliability of IDS in detecting potential security threats. 

In parallel, oversampling methods have been widely employed to augment the representation of 
minority classes in the dataset. Popoola et al. [14] propose a deep learning algorithm for botnet attack 
detection that efficiently handles highly imbalanced network traffic data by using 
SMOTE to balance classes and a deep recurrent neural network (DRNN) to learn 
discriminative features from the balanced data. SMOTE is used to generate synthetic samples for the 
positive class with the aim of achieving a balanced training dataset [15]. However, SMOTE is 
prone to overfitting because it tends to produce samples similar to existing instances. To address this 
issue, [16] introduces an oversampling technique called SMOTE-NaN-DE, which combines SMOTE-based 
synthetic sample generation with a natural neighbor-based error detection method to improve 
class-imbalanced data. Another algorithm proposed in the literature to handle imbalance 
without major overfitting effects is ADASYN, which has been applied to imbalanced datasets 
such as NSL-KDD [17]. ADASYN balances the sample distribution, preventing the model from 
favoring large classes and neglecting smaller ones [18]. An IDS using ADASYN oversampling and 
LightGBM is evaluated on the NSL-KDD, UNSW-NB15, and CICIDS2017 datasets [19]; however, 
ADASYN is computationally more expensive than SMOTE. Other studies also address multiclass 
scenarios, such as [20], which introduces and assesses BotIDS, an IDS model based on recurrent neural 
network (RNN), long short-term memory (LSTM), and gated recurrent unit (GRU) architectures, using a 
specific Bot-IoT dataset. It demonstrates promising results, with a validation accuracy of 99.94%, a 
validation loss of 0.58%, and a prediction execution time under 0.34 ms. However, this scheme also 
suffers from relatively high computational complexity. 

The imbalance ratio technique, as detailed by [13], utilizes a specific IRF to mitigate class 
imbalance within datasets. Additionally, research exemplified by [21] underscores the primary objective of 
imbalance ratio techniques, which is to alleviate bias towards the majority class, thereby enhancing system 
efficiency and average accuracy in the realm of imbalanced datasets. The application of these approaches has 
shown promising outcomes in elevating the overall performance of IDS when confronted with class 
imbalance scenarios. 

Despite the notable progress made in prior research to tackle class imbalance in IDS, there exists an 
unexplored potential for comprehensive evaluations that integrate both imbalance ratio techniques and 
oversampling methods, harnessing the capabilities of XGBoost. This study seeks to fill this gap by 
conducting a thorough analysis of the strengths and weaknesses associated with the combined application of 
these approaches. The aim is to contribute valuable insights to the enhancement of IDS capabilities, 
particularly in navigating complex and high-risk environments, thus advancing the field's understanding and 
effectiveness in addressing security challenges. 


3. METHOD 

In response to the security challenges prevalent in the IoT environment, we present a tailored 
method designed explicitly for intrusion detection. Our proposed approach involves the implementation of an 
IDS customized to accommodate the unique characteristics of IoT. This method is strategically crafted to 
offer robust protection against the evolving and intricate security threats that arise within the dynamic IoT 
ecosystem. By addressing the specific challenges of the IoT landscape, our proposed methodology aims to 
enhance the overall security posture and resilience of IoT devices and systems. 

In Figure 1, the implementation of the analysis using the Bot-IoT dataset is depicted, involving a 
series of preprocessing steps, including data cleaning, data transformation, and feature engineering. To address 
the issue of class imbalance, the application of oversampling techniques using ADASYN and SMOTE is 
employed, along with the utilization of class weight techniques to enhance the role of minority classes in the 
model. Data is then partitioned with an 80% allocation for training and 20% for testing, while the XGBoost 
model is chosen as the primary algorithm. The results of the analysis are evaluated using a confusion matrix, 
providing an in-depth insight into the model's ability to handle class imbalance in the Bot-IoT dataset. 

Figure 1. Process of constructing the model 


3.1. Bot-IoT dataset preprocessing 

The Bot-IoT dataset encompasses 46 network traffic features and incorporates three distinct label 
categories tailored for binary, 5-class, and 11-class classifications. A comprehensive description of these 
features and labels can be found in [15]. However, in the course of our analysis, it was discerned that only 37 
out of the 46 features proved relevant for the identification of botnet attacks within IoT networks. 
Specifically, 'pkSeqID', 'saddr', 'daddr', 'proto', 'state', 'flgs', 'sport', 'dport', and 'subcategory' were 
excluded from the analysis. 

The dataset presented in Table 1 consists of five classes, with a total of 3,668,522 
data entries. The DDoS class is the largest, with 1,926,624 entries, making DDoS the most common 
attack. The DoS class, with 1,650,260 entries, also represents a significant share of the traffic. The 
Reconnaissance class has 91,082 entries, indicating network scanning or reconnaissance activities commonly 
associated with attack planning. The normal class has only 477 entries, encompassing 
non-suspicious network traffic. Lastly, the Theft class is the smallest, with 79 entries, reflecting rare 
attempts at data or information theft. Data analysis in this context will aid in developing an effective IDS 
for identifying and safeguarding against various types of attacks. 


Table 1. Distribution of multiclass target category frequencies in the dataset 


No     Category          Class number 
1      DDoS              1,926,624 
2      DoS               1,650,260 
3      Reconnaissance    91,082 
4      Normal            477 
5      Theft             79 
Total                    3,668,522 


The subcategory distribution in Table 2 includes eight classes, with a cumulative total of 
3,668,522 data entries. The user datagram protocol (UDP) subcategory is the largest, with 1,981,230 entries, 
indicating the prevalence of UDP-based network traffic. The TCP subcategory follows with 1,593,180 
entries, representing TCP-based traffic. Service Scan has 73,168 entries, reflecting network scanning or 
service enumeration activities. OS Fingerprint includes 17,914 entries, signifying efforts to identify the 
target system's operating system. HTTP has 2,474 entries, covering web-related traffic. Normal 
comprises 477 entries, reflecting non-suspicious network activities. Keylogging has 73 entries, corresponding 
to keylogger-based attacks. Data Exfiltration is the smallest subcategory, with only 6 entries, 
denoting rare attempts at data exfiltration. Analysis of these subcategories is crucial for developing a 
comprehensive IDS capable of identifying and mitigating various threat types. 


Table 2. Distribution of multiclass target subcategory frequencies in the dataset 


No     Subcategory          Class number 
1      UDP                  1,981,230 
2      TCP                  1,593,180 
3      Service_Scan         73,168 
4      OS_fingerprint       17,914 
5      HTTP                 2,474 
6      Normal               477 
7      Keylogging           73 
8      Data exfiltration    6 
Total                       3,668,522 


The combined dataset presented in Table 3 comprises various classes and subcategories, with a total 
of 3,668,522 data entries. These entries carry both category and subcategory labels, 
reflecting the diversity of network traffic and activities. The combined labels are DoS-UDP, DDoS-TCP, 
DDoS-UDP, DoS-TCP, Reconnaissance-Service_Scan, Reconnaissance-OS_Fingerprint, DoS-HTTP, 
DDoS-HTTP, Normal-normal, Theft-keylogging, and Theft-Data_Exfiltration [22], [23]. The combined 
analysis of these categories and subcategories is essential for the development of an effective IDS capable of 
accurately identifying and mitigating a wide range of potential threats in network traffic data. 


Table 3. Combined dataset class and subcategory distribution 


No     Label                             Class number 
1      DoS-UDP                           1,032,975 
2      DDoS-TCP                          977,380 
3      DDoS-UDP                          948,255 
4      DoS-TCP                           615,800 
5      Reconnaissance-Service_Scan       73,168 
6      Reconnaissance-OS_Fingerprint     17,914 
7      DoS-HTTP                          1,485 
8      DDoS-HTTP                         989 
9      Normal-normal                     477 
10     Theft-keylogging                  73 
11     Theft-Data_Exfiltration           6 
Total                                    3,668,522 


A comprehensive evaluation of multiclass imbalance techniques with ensemble models ... (Januar Al Amien) 
ISSN: 1693-6930 


3.2. A new imbalance ratio approach 

To extend the method previously proposed for binary classification [12], this paper 
proposes a new imbalance ratio approach for multiclass classification. The previous study 
addressed binary classification problems, where the goal was to handle the difference between two 
classes. In this study, we expand its scope to multiclass classification. The steps to 
calculate the IRF for a dataset are: 
a. Find the number of samples in each class. 
b. For each class i, determine the number of samples in the majority class ($N_i$) and the number of samples in 
   the minority class ($n_i$). 
c. Calculate the imbalance ratio ($IR_i$) for each class i as (1): 

   $$IR_i = \frac{N_i}{n_i} \tag{1}$$ 

d. Calculate the IRF value for the dataset as the maximum imbalance ratio across all classes: 

   $$IRF = \max(IR_1, IR_2, \ldots, IR_k) \tag{2}$$ 

e. Calculate the average of the values obtained in step b. 
f. Return the result. 
Where k is the total number of classes in the dataset. 

The algorithm above calculates the IRF for a given dataset. It involves 
several steps: determining the number of samples in each class, then identifying the number of samples 
in both the majority class ($N_i$) and the minority class ($n_i$) for each class i. The imbalance ratio ($IR_i$) for 
each class is then computed by dividing $N_i$ by $n_i$. The IRF value for the entire dataset is determined by 
selecting the maximum imbalance ratio across all classes. Subsequently, the average of the values obtained in 
the second step is calculated, and the result is returned. This algorithm is applicable to datasets with k total 
classes. 
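
As a rough illustration of steps a to d, the sketch below computes a per-class imbalance ratio and takes the maximum. It assumes one possible reading of the definitions, namely that the majority count is the size of the largest class and the minority count is the size of class i. The function name `imbalance_ratios` is hypothetical, and the counts are the label counts listed in Table 3.

```python
# Sketch only: assumes N_i = size of the largest class and n_i = size of class i.
# The function name is illustrative and does not come from the paper.

def imbalance_ratios(class_counts):
    """Return {label: N / n_i}, where N is the largest class count."""
    majority = max(class_counts.values())
    return {label: majority / count for label, count in class_counts.items()}

# Label counts as listed in Table 3 (11-class Bot-IoT labels).
counts = {
    "DoS-UDP": 1032975,
    "DDoS-TCP": 977380,
    "DDoS-UDP": 948255,
    "DoS-TCP": 615800,
    "Reconnaissance-Service_Scan": 73168,
    "Reconnaissance-OS_Fingerprint": 17914,
    "DoS-HTTP": 1485,
    "DDoS-HTTP": 989,
    "Normal-normal": 477,
    "Theft-keylogging": 73,
    "Theft-Data_Exfiltration": 6,
}

ratios = imbalance_ratios(counts)   # step c: one ratio per class
irf = max(ratios.values())          # step d: maximum ratio across all classes
print(f"IRF = {irf:.1f}")
```

Under this reading, the largest class has a ratio of 1 and the rarest class determines the dataset-level value.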


3.3. Synthetic minority oversampling technique 

SMOTE is a method for addressing class imbalance by augmenting the minority class. This is 
achieved by generating synthetic data points derived from existing 
minority class instances [24]. In the SMOTE process, oversampling is carried out by identifying the 
k-nearest neighbors of each minority class instance and then generating synthetic instances that interpolate 
between the selected instance and its neighbors. This approach helps mitigate the 
overfitting that may arise when minority class instances are simply replicated. By adjusting the class 
distribution in this manner, SMOTE increases the amount of data in the minority class, making it a valuable 
tool for enhancing the performance of machine learning models on imbalanced datasets. 
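
The interpolation step described above can be sketched in a few lines of standard-library Python. This is a simplified illustration and not the implementation used in the experiments: the function name `smote_sketch`, the toy points, and the choice of k are all assumptions made for the example.

```python
import math
import random

def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating toward nearby minority points."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the base point (the base itself is excluded)
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # interpolation factor in [0, 1)
        # new point lies on the segment between the base point and the neighbor
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic

# Toy minority class with two features (hypothetical values).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority, n_new=6)
```

Because every synthetic point lies between two real minority points, the new samples stay inside the region already occupied by the minority class.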


3.4. Adaptive synthetic sampling 

ADASYN is grounded in the concept of adaptively sampling data for minority classes while 
considering the distribution of these classes. This approach allows ADASYN to dynamically produce 
synthetic data samples for minority classes, mitigating bias stemming from unbalanced data distribution [24]. 
The technique generates more synthetic data points for minority instances that are harder to learn, 
that is, those surrounded by more majority class neighbors, resulting in a more balanced dataset for machine 
learning models. 
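
To make the adaptive part concrete, the sketch below shows how ADASYN decides how many synthetic samples to allocate to each minority point: the share of majority class points among the k nearest neighbors is normalized and multiplied by the total number of samples to generate. The function name `adasyn_allocation`, the toy coordinates, and the parameter values are assumptions for illustration, and the generation of the samples themselves is omitted.

```python
import math

def adasyn_allocation(minority, majority, k=2, beta=1.0):
    """Return the number of synthetic samples to generate for each minority point."""
    everything = [(p, 0) for p in minority] + [(p, 1) for p in majority]
    total_needed = (len(majority) - len(minority)) * beta  # samples needed to balance
    hardness = []
    for point in minority:
        neighbors = sorted(
            (item for item in everything if item[0] is not point),
            key=lambda item: math.dist(point, item[0]),
        )[:k]
        # share of majority class points among the k nearest neighbors
        hardness.append(sum(flag for _, flag in neighbors) / k)
    norm = sum(hardness)
    if norm == 0:
        return [0] * len(minority)
    return [round(r / norm * total_needed) for r in hardness]

# Three minority points sit in a clean cluster, one sits inside majority territory.
minority = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (2.0, 2.0)]
majority = [(2.1, 2.0), (1.9, 2.1), (2.0, 1.8), (5.0, 5.0), (5.1, 5.0), (4.9, 5.2)]
allocation = adasyn_allocation(minority, majority)
```

In this toy case all new samples are assigned to the minority point that is surrounded by majority neighbors, which is the behavior that distinguishes ADASYN from uniform oversampling.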


3.5. XGBoost 

XGBoost [25] is a powerful and versatile machine learning algorithm that has gained widespread 
popularity in the data science and predictive modeling community. XGBoost is an ensemble learning method 
that is primarily used for regression and classification tasks. It improves predictive accuracy 
by combining the predictions from multiple weak learners (usually decision trees) to form a more 
robust and accurate final prediction. One of XGBoost's key strengths lies in its ability to handle complex 
datasets with a high degree of accuracy, making it a valuable tool for a wide range of applications, including 
image and text classification, as well as structured data analysis. XGBoost is known for its efficiency and 
scalability, often outperforming other algorithms in both speed and accuracy. Its adaptive gradient boosting 
approach, regularization techniques, and parallel processing capabilities make it a popular choice among data 
scientists for tackling various machine learning challenges. 


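$$\mathcal{L}^{(t)} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t) \tag{3}$$

$$\mathcal{L}^{(t)} \simeq \sum_{i=1}^{n} \left[ l\left(y_i, \hat{y}_i^{(t-1)}\right) + g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) \right] + \Omega(f_t) \tag{4}$$

$$\tilde{\mathcal{L}}^{(t)} = \sum_{i=1}^{n} \left[ g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) \right] + \Omega(f_t) \tag{5}$$

$$w_j^{*} = -\frac{\sum_{i \in I_j} g_i}{\sum_{i \in I_j} h_i + \lambda} \tag{6}$$

$$\tilde{\mathcal{L}}^{(t)}(q) = -\frac{1}{2} \sum_{j=1}^{T} \frac{\left(\sum_{i \in I_j} g_i\right)^2}{\sum_{i \in I_j} h_i + \lambda} + \gamma T \tag{7}$$

$$\mathcal{L}_{split} = \frac{1}{2} \left[ \frac{\left(\sum_{i \in I_L} g_i\right)^2}{\sum_{i \in I_L} h_i + \lambda} + \frac{\left(\sum_{i \in I_R} g_i\right)^2}{\sum_{i \in I_R} h_i + \lambda} - \frac{\left(\sum_{i \in I} g_i\right)^2}{\sum_{i \in I} h_i + \lambda} \right] - \gamma, \quad I = I_L \cup I_R \tag{8}$$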


The formulas above represent the objective function used in XGBoost training, 
which aims to minimize the loss and construct a robust ensemble model. In these equations, $\mathcal{L}^{(t)}$ is the 
cumulative loss at iteration t, and it is composed of three main components. The first component 
quantifies the difference between the target $y_i$ and the prediction obtained by adding the output of the new 
tree $f_t$ to the prediction of the previous iteration $\hat{y}_i^{(t-1)}$. The second component introduces the regularization term 
($\Omega$) to control the complexity of the model and avoid overfitting. The third component combines gradient ($g$) 
and Hessian ($h$) statistics to guide the construction of individual trees in the ensemble. The optimal weight 
$w_j^{*}$ for each leaf node is calculated, and the overall loss $\tilde{\mathcal{L}}^{(t)}(q)$ is computed based on these weights and 
other hyperparameters, such as the regularization term $\lambda$ and the per-leaf complexity penalty $\gamma$, where $I_L$, $I_R$ and $I_j$ are the 
sets of instances of the left, right and j-th leaf respectively. Additionally, a splitting criterion $\mathcal{L}_{split}$ is 
employed to determine the optimal tree structure while minimizing the overall loss. These equations 
demonstrate the mathematical foundation behind the XGBoost algorithm, which efficiently combines 
decision trees into a powerful ensemble model for predictive tasks. 
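
The optimal leaf weight and the splitting criterion described above can be checked with a small worked example. The sketch below assumes a squared-error loss of the form (1/2)(y - y_hat)^2, for which the gradient is y_hat - y and the Hessian is 1; the function names and the toy values are hypothetical and serve only to illustrate the arithmetic.

```python
def leaf_weight(g, h, lam):
    """Optimal leaf weight: -sum(g) / (sum(h) + lambda)."""
    return -sum(g) / (sum(h) + lam)

def structure_score(g, h, lam):
    """Quality term of one leaf: sum(g)^2 / (sum(h) + lambda)."""
    return sum(g) ** 2 / (sum(h) + lam)

def split_gain(g_left, h_left, g_right, h_right, lam, gamma):
    """Loss reduction from splitting a node into a left and a right leaf."""
    parent = structure_score(g_left + g_right, h_left + h_right, lam)
    children = structure_score(g_left, h_left, lam) + structure_score(g_right, h_right, lam)
    return 0.5 * (children - parent) - gamma

# Four samples with targets [1, 1, 0, 0] and a current prediction of 0.5 for each.
# With squared-error loss: g_i = prediction - target, h_i = 1.
g_left, h_left = [-0.5, -0.5], [1.0, 1.0]    # the two samples with target 1
g_right, h_right = [0.5, 0.5], [1.0, 1.0]    # the two samples with target 0

gain = split_gain(g_left, h_left, g_right, h_right, lam=1.0, gamma=0.0)
w_left = leaf_weight(g_left, h_left, lam=1.0)
w_right = leaf_weight(g_right, h_right, lam=1.0)
```

Here the split separates the two target values perfectly, so the gain is positive and the two leaves receive weights of equal magnitude and opposite sign. A larger gamma would subtract directly from the gain and could make the same split unattractive.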


3.6. Evaluating confusion matrix 

According to Liu et al. [26], a confusion matrix is a fundamental tool in evaluating the performance 
of a classification model. It provides a comprehensive summary of the model's predictions by categorizing 
them into four different outcomes: true positives (correctly predicted positive instances), true negatives 
(correctly predicted negative instances), false positives (incorrectly predicted positive instances), and false 
negatives (incorrectly predicted negative instances). These metrics offer valuable insights into the model's 
performance, beyond accuracy alone, particularly in scenarios with imbalanced datasets where one class 
significantly outnumbers the other. Among these metrics, precision holds a special significance. It measures 
the reliability of the model's positive predictions and represents the proportion of true positive predictions 
relative to all instances classified as positive, regardless of their actual correctness. A high precision value 
indicates a system's ability to make trustworthy positive predictions, making it a critical metric to optimize in 
situations where the cost of false positives is high, and reliability is paramount. The model's performance was 
evaluated using the confusion matrix, accuracy, precision, recall, and F1 score for binary classification, and 
recall and false positive rate (FPR) for multiclass classification [8]. 
The following formulas represent essential performance metrics for evaluating classification models, where TP, 
TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives: 

a. Precision: it measures the accuracy of positive predictions, specifically the ratio of true positives (correct 
positive predictions) to the total instances classified as positive, Precision = TP / (TP + FP). A high 
precision indicates fewer false positives. 

b. Recall: it gauges the model's ability to capture all positive instances. It is calculated as the ratio of true 
positives to the sum of true positives and false negatives, Recall = TP / (TP + FN). 

c. F1 score: this is the harmonic mean of precision and recall, F1 = 2 x Precision x Recall / (Precision + 
Recall), providing a balanced measure of a model's overall performance. 

d. False negative rate (FNR): it quantifies the ratio of false negatives (positive instances incorrectly 
predicted as negative) to the total actual positive instances, FNR = FN / (FN + TP). 

e. False positive rate (FPR): it measures the ratio of false positives (negative instances 
incorrectly predicted as positive) to the total actual negative instances, FPR = FP / (FP + TN) [27]. 

f. Accuracy: it represents the overall correctness of a model by calculating the ratio of correct predictions 
(true positives and true negatives) to the total number of predictions made, Accuracy = (TP + TN) / 
(TP + TN + FP + FN). 
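
In a multiclass setting these quantities are usually computed per class by treating one class as positive and all others as negative. The sketch below does this for a small, made-up 3-class confusion matrix; the function names and the matrix values are hypothetical and are not results from this study.

```python
def one_vs_rest_metrics(cm, k):
    """Metrics for class k from a confusion matrix indexed as cm[actual][predicted]."""
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                  # actual k, predicted as another class
    fp = sum(row[k] for row in cm) - tp   # actual other class, predicted as k
    tn = total - tp - fn - fp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "fnr": fn / (fn + tp) if fn + tp else 0.0,
        "fpr": fp / (fp + tn) if fp + tn else 0.0,
    }

def accuracy(cm):
    """Share of all predictions that fall on the diagonal."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / sum(sum(row) for row in cm)

# Made-up counts: class 2 is rare (4 samples out of 100).
cm = [
    [50, 2, 0],
    [3, 40, 1],
    [0, 1, 3],
]
rare = one_vs_rest_metrics(cm, 2)
overall = accuracy(cm)
```

In this toy matrix the overall accuracy is 0.93 while the recall of the rare class is only 0.75, which illustrates why per-class metrics matter on imbalanced data.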


4. RESULTS AND DISCUSSION 
4.1. Hardware specifications 

This section presents details regarding the hardware configuration employed in the experiments. The 
system utilized for the study was equipped with an Intel(R) Xeon(R) Gold 6134 CPU operating at 3.20 GHz, 
complemented by 65536 MB of RAM. The system operated within a virtual machine environment, 
employing the Anaconda application running Python version 3.9.16. Graphic experiments were executed on 
the Anaconda platform, leveraging various Python libraries. 


4.2. Feature engineering 

The process of feature engineering in the IDS dataset involves the transformation and adjustment of 
attributes to enhance the modeling and intrusion detection capabilities. Feature engineering encompasses 
various operations such as modifying, eliminating, or combining features, and even introducing new ones 
based on domain expertise or a deeper understanding of the problem. In this context, features categorized as 
Object type, including pkSeqID, flgs, proto, saddr, sport, daddr, dport, state, attack, category, and 
subcategory, are removed to streamline the dataset. Following this pruning, the classification now relies on a 
reduced set of 36 features. Additionally, the remaining categorical attributes will undergo Label Encoding for 
their use in the classification process. 


4.3. Class imbalance handling 

Table 4 reflects efforts to address class imbalance through three methods, 
namely IRF, SMOTE, and ADASYN. The "Class number" column gives the number of 
samples in each class, the "IRF" column lists the value produced by the IRF method for each class, and the 
"SMOTE" and "ADASYN" columns list the number of samples per class after oversampling. The IRF 
calculation uses the formula $IR_i = N_i / n_i$, which represents the ratio 
between the number of samples in the majority class ($N_i$) and the minority class ($n_i$) for each class. The 
final IRF for the entire dataset is obtained by taking the maximum value from all imbalance ratios ($IR_i$). 
Thus, this table offers a view of how each method treats class 
imbalance, with a specific focus on the IRF method. 


Table 4. Class imbalance handling IRF and oversampling results for label 


No     Label                             Class number    IRF         SMOTE      ADASYN 
1      DoS-UDP                           1032975         0.718422    1032975    1032975 
2      DDoS-TCP                          977380          0.733577    1032975    1033012 
3      DDoS-UDP                          948255          0.741516    1032975    1033006 
4      DoS-TCP                           615800          0.832139    1032975    1032993 
5      Reconnaissance-Service_Scan       73168           0.980055    1032975    1033005 
6      Reconnaissance-OS_Fingerprint     17914           0.995117    1032975    1032994 
7      DoS-HTTP                          1485            0.999595    1032975    1032973 
8      DDoS-HTTP                         989             0.999730    1032975    1032979 
9      Normal-normal                     477             0.999870    1032975    1032980 
10     Theft-keylogging                  73              0.999980    1032975    1032973 
11     Theft-Data_Exfiltration           6               0.999998    1032975    1032976 
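
The values in the IRF column of Table 4 appear to be consistent with the expression 1 - n_i / N, where n_i is the number of samples in class i and N is the sum of all class counts. This is an observation drawn from the table values rather than a formula stated in the text, so the sketch below should be read as a consistency check under that assumption. The helper names `class_weights` and `sample_weights` are hypothetical.

```python
# Assumption: per-class weight = 1 - n_i / N, with N the sum of the class counts.
counts = {
    "DoS-UDP": 1032975,
    "DDoS-TCP": 977380,
    "DDoS-UDP": 948255,
    "DoS-TCP": 615800,
    "Reconnaissance-Service_Scan": 73168,
    "Reconnaissance-OS_Fingerprint": 17914,
    "DoS-HTTP": 1485,
    "DDoS-HTTP": 989,
    "Normal-normal": 477,
    "Theft-keylogging": 73,
    "Theft-Data_Exfiltration": 6,
}

def class_weights(class_counts):
    """Give rarer classes a weight closer to 1 and frequent classes a lower weight."""
    n_total = sum(class_counts.values())
    return {label: 1 - count / n_total for label, count in class_counts.items()}

def sample_weights(labels, weights):
    """Expand per-class weights into one weight per training sample."""
    return [weights[label] for label in labels]

weights = class_weights(counts)
batch = ["DoS-UDP", "Theft-keylogging", "Normal-normal"]   # labels of three samples
batch_weights = sample_weights(batch, weights)
```

A per-sample vector of this kind is the form typically expected by gradient boosting libraries, for example the `sample_weight` argument of `XGBClassifier.fit` in the xgboost Python package.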


4.4. Utilizing XGBoost models and performance evaluation 

In this study, we used the full Bot-IoT dataset and applied label encoding to convert the 
"category" and "label" columns into numerical values. We also applied the StandardScaler method to 
standardize the feature values. The dataset was then divided into training and testing sets, with 20% of the data 
allocated for testing, using a fixed random state to ensure reproducibility. 
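
The three preprocessing steps mentioned here (label encoding, standardization, and an 80/20 split with a fixed seed) can be sketched with the standard library alone. This is not the code used in the experiments, which relied on existing Python libraries; the function names, the seed value, and the toy data are assumptions, and the sketch fits the scaling statistics on the training part only, which is a design choice of this example.

```python
import random
import statistics

def encode_labels(labels):
    """Map each distinct label to an integer, using sorted label order."""
    mapping = {label: idx for idx, label in enumerate(sorted(set(labels)))}
    return [mapping[label] for label in labels], mapping

def split_80_20(rows, test_fraction=0.2, seed=42):
    """Shuffle with a fixed seed and hold out a fraction of the rows for testing."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def standardize(reference, values):
    """Scale values to zero mean and unit variance using statistics of `reference`."""
    mean = statistics.fmean(reference)
    std = statistics.pstdev(reference)  # population standard deviation
    return [(v - mean) / std for v in values]

encoded, mapping = encode_labels(["DoS", "DDoS", "Normal", "DoS"])

feature = [float(i) for i in range(10)]            # one toy feature column
train, test = split_80_20(feature)
train_scaled = standardize(train, train)
test_scaled = standardize(train, test)             # reuse the training statistics
```

Keeping the seed fixed means the same rows land in the test set on every run, which is what makes timing and accuracy comparisons between methods repeatable.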

In our predictive modeling, we harness the power of the XGBoost algorithm, a robust and versatile 
tool known for its excellent performance in classification tasks. The XGBoost model allows us to achieve 
high predictive accuracy and generalization. To fine-tune the model's performance and adapt it to our specific 
dataset, we employed various hyperparameters. Among these parameters, 'max_depth' plays a crucial role 
in controlling the depth of the individual decision trees within the ensemble, helping to balance model 
complexity and overfitting. Additionally, as noted by Dhaliwal et al. [28], 'learning_rate' influences the contribution of 
each tree to the final prediction, while 'subsample' and 'colsample_bytree' regulate the sampling of training 
data and features, respectively. The 'gamma' parameter serves as a regularization term, controlling the 
reduction in loss required to make a further partition on a leaf node. Careful tuning of these parameters 
allows us to optimize the XGBoost model's performance and tailor it to the unique characteristics of our 
classification task. Furthermore, we applied the IRF calculation through the 'scale_pos_weight' 
parameter to further adapt the model to the imbalanced class distribution, ensuring that it 
effectively addresses the class imbalance challenge. 

Table 5 presents the performance metrics of the methods evaluated for mitigating the imbalanced data issue, 
namely IRF, SMOTE, and ADASYN, each combined with XGBoost. The metrics include accuracy, precision, 
recall, F1 score, and training time. Figure 2 (see Appendix) visually compares the multiclass classification 
outcomes of these methods across six distinct scenarios, each configured with unique settings, with 
Figure 2(a) showing IRF, Figure 2(b) SMOTE, and Figure 2(c) ADASYN. The graphical representation 
highlights the efficacy of these methods in diverse contexts and the noticeable differences in training times. 


Table 5. Imbalanced problem: IRF, oversampling, and XGBoost results 


No     Label                   Accuracy    Precision    Recall      F1 score    Time (sec) 
1      IRF XGBOOST label       0.999993    0.999993     0.999993    0.999993    540 
2      SMOTE XGBOOST label     0.999999    0.999999     0.999999    0.999999    1494 
3      ADASYN XGBOOST label    0.999999    0.999999     0.999999    0.999999    1716 


The findings suggest that the integration of oversampling techniques such as SMOTE and 
ADASYN in conjunction with the XGBoost algorithm results in a successful resolution of the imbalanced 
data challenge across various contexts. It is imperative to highlight that the training duration varies based on 
the selected approach, with certain techniques necessitating prolonged training times while still yielding 
remarkable results. This underscores the adaptability and robustness of these combined methodologies in 
mitigating challenges associated with imbalanced data. Furthermore, IRF distinguishes itself for its 
efficiency, not only effectively addressing imbalanced data but also reducing classification time to less than 
300 seconds, which is at least twice as fast as the times recorded with the 
conventional oversampling algorithms. This characteristic underscores the effectiveness of IRF as a solution 
for enhancing risk analysis across diverse contexts, ensuring both accuracy and temporal efficiency. 
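
The relative training cost can be read directly from the "Time (sec)" column of Table 5. The short calculation below, with a hypothetical variable layout, converts those three training times into speedup factors of IRF over each oversampling method.

```python
# Training times in seconds, taken from the "Time (sec)" column of Table 5.
train_seconds = {"IRF": 540, "SMOTE": 1494, "ADASYN": 1716}

# Speedup of IRF relative to each oversampling method.
speedup = {
    name: seconds / train_seconds["IRF"]
    for name, seconds in train_seconds.items()
    if name != "IRF"
}

for name, factor in speedup.items():
    print(f"IRF trains {factor:.2f}x faster than {name}")
```

Both factors exceed two, which is in line with the statement that IRF is at least twice as fast as the oversampling approaches.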

Table 6 presents a performance comparison among several methods, namely IRF XGBOOST label, the 
convolutional neural network (CNN) from [20], the RNN from [29], and the LS-DRNN 
from [23], specifically in the context of imbalanced classes. IRF XGBOOST label achieves an 
accuracy, precision, recall, and F1 score of 0.999993 each, while requiring approximately 
540 seconds for training. The CNN model from [20] achieves an accuracy of 0.99935 with a training 
time of 1395 seconds. The RNN from [29] shows an accuracy of 0.9820, with no precision, recall, 
or F1 score reported. LS-DRNN, from [23], attains an accuracy of 0.9993 with 
precision 0.9687, recall 0.9975, F1 score 0.9822, and a training time of approximately 616.96 seconds. 


Table 6. Performance comparison across 11 classes 


Label           Accuracy    Precision    Recall      F1 score    Time (sec) 
CNN [20]        0.99935     -            -           -           1395 
RNN [29]        0.9820      -            -           -           - 
LS-DRNN [23]    0.9993      0.9687       0.9975      0.9822      616.96 
IRF XGBOOST     0.999993    0.999993     0.999993    0.999993    540 


5. CONCLUSION 

This study provides results that illustrate the comparison between the IRF approach and 
oversampling techniques, specifically SMOTE and ADASYN, in addressing class imbalance in Bot-IoT 
datasets. Evaluation results indicate that IRF combined with XGBoost achieves an accuracy of 0.999993, 
comparable to the 0.999999 obtained with SMOTE and ADASYN oversampling. Although 
oversampling yields marginally higher scores, IRF requires substantially less training time 
(540 seconds versus 1,494 and 1,716 seconds). The integration of IRF with the XGBoost ensemble model produces 
excellent results, showcasing potential adaptability and higher robustness in enhancing IDS performance in 
IoT environments with imbalanced data. As a direction for future research, it is recommended to further 
explore the application of the imbalance ratio technique in various other IoT dataset imbalance scenarios to 
comprehensively understand its performance and applicability. 

Figure 2. Comparing multiclass classification outcomes across class types: (a) IRF 